Pretraining with Random Noise for Fast and Robust Learning without Weight Transport

Neural Information Processing Systems

The brain prepares for learning even before interacting with the environment, by refining and optimizing its structures through spontaneous neural activity that resembles random noise. However, the mechanism of such a process has yet to be understood, and it is unclear whether this process can benefit the algorithm of machine learning.


Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation

Li, Chengyu, Cheng, Debo, Zhang, Guixian, Li, Yi, Zhang, Shichao

arXiv.org Machine Learning

Graph Neural Networks (GNNs) have demonstrated strong performance in graph representation learning across various real-world applications. However, they often produce biased predictions caused by sensitive attributes, such as religion or gender, an issue that has been largely overlooked in existing methods. Recently, numerous studies have focused on reducing biases in GNNs. However, these approaches often rely on training with partial data (e.g., using either node features or graph structure alone), which can enhance fairness but frequently compromises model utility due to the limited utilization of available graph information. To address this trade-off, we propose an effective strategy to balance fairness and utility in knowledge distillation. Specifically, we introduce FairDTD, a novel Fair representation learning framework built on Dual-Teacher Distillation, leveraging a causal graph model to guide and optimize the design of the distillation process. FairDTD employs two fairness-oriented teacher models, a feature teacher and a structure teacher, to facilitate dual distillation, with the student model learning fairness knowledge from the teachers while also leveraging the full data to mitigate utility loss. To enhance information transfer, we incorporate graph-level distillation to provide an indirect supplement of graph information during training, as well as a node-specific temperature module to improve the comprehensive transfer of fair knowledge. Experiments on diverse benchmark datasets demonstrate that FairDTD achieves optimal fairness while preserving high model utility, showcasing its effectiveness in fair representation learning for GNNs.
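
As a rough illustration of how such a dual-teacher objective can be wired up, here is a minimal PyTorch sketch: the student is trained on its task loss plus KL-divergence terms against two teachers. The loss weights, fixed temperature, and function names are assumptions for illustration only; the paper's causal-graph guidance, graph-level distillation, and node-specific temperature module are not reproduced here.

```python
# Minimal sketch of dual-teacher distillation in the spirit of FairDTD.
# The weightings (alpha, beta) and the fixed temperature T are illustrative
# assumptions; the paper additionally uses graph-level distillation and a
# node-specific temperature, which this sketch omits.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened class distributions."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

def dual_teacher_loss(student_logits, feat_logits, struct_logits, labels,
                      alpha=0.5, beta=0.25):
    """Task loss plus distillation from a feature teacher and a structure
    teacher; teacher logits are assumed precomputed (and detached)."""
    task = F.cross_entropy(student_logits, labels)
    return (alpha * task
            + beta * distill_loss(student_logits, feat_logits)
            + (1 - alpha - beta) * distill_loss(student_logits, struct_logits))
```

In use, feat_logits and struct_logits would come from the two pretrained fairness-oriented teachers, while student_logits come from a student GNN trained on the full graph.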


New Directions in Text Classification Research: Maximizing The Performance of Sentiment Classification from Limited Data

Agustian, Surya, Syah, Muhammad Irfan, Fatiara, Nurul, Abdillah, Rahmad

arXiv.org Artificial Intelligence

Stakeholders in sentiment analysis need both speed and accuracy, whether the issues under study are positive or negative. One new challenge in sentiment analysis tasks is limited training data, which often leads to suboptimal machine learning models and poor performance on test data. This paper discusses the problem of text classification based on limited training data (300 to 600 samples) into three classes: positive, negative, and neutral. A benchmark dataset is provided for training and testing data on the issue of Kaesang Pangarep's appointment as Chairman of PSI. External data for aggregation and augmentation purposes are provided, consisting of two datasets: the topic of Covid Vaccination sentiment and an open topic. The official score used is the F1-score, which balances precision and recall among the three classes, positive, negative, and neutral. A baseline score is provided as a reference for researchers for unoptimized classification methods. The optimized score is provided as a reference for the target score to be achieved by any proposed method. Both scores (baseline and optimized) are obtained with the SVM method, which is widely reported as the state-of-the-art among conventional machine learning methods. The F1-scores achieved by the baseline and optimized methods are 40.83% and 51.28%, respectively.
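
For readers who want a concrete starting point, a baseline of this kind can be assembled in a few lines of scikit-learn. The TF-IDF features, toy data, and default hyperparameters below are assumptions, not the authors' exact setup.

```python
# Minimal sketch of an SVM baseline for 3-class sentiment classification,
# assuming TF-IDF features; the paper's preprocessing, dataset, and tuned
# hyperparameters are not reproduced here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical tiny corpus standing in for the 300-600 labeled samples.
train_texts = ["a good appointment", "a bad decision", "no opinion on this",
               "great choice for chairman", "poor leadership", "just the news"]
train_labels = ["positive", "negative", "neutral",
                "positive", "negative", "neutral"]
test_texts = ["good choice", "bad decision for the party", "no comment"]
test_labels = ["positive", "negative", "neutral"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(train_texts, train_labels)
pred = clf.predict(test_texts)

# Macro F1 averages the per-class F1, balancing precision and recall
# across the positive, negative, and neutral classes.
print(f1_score(test_labels, pred, average="macro"))
```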


Pretraining with Random Noise for Fast and Robust Learning without Weight Transport

Cheon, Jeonghwan, Lee, Sang Wan, Paik, Se-Bum

arXiv.org Artificial Intelligence

The brain prepares for learning even before interacting with the environment, by refining and optimizing its structures through spontaneous neural activity that resembles random noise. However, the mechanism of such a process has yet to be thoroughly understood, and it is unclear whether this process can benefit the algorithm of machine learning. Here, we study this issue using a neural network with a feedback alignment algorithm, demonstrating that pretraining neural networks with random noise increases learning efficiency as well as generalization ability without weight transport. First, we found that random noise training modifies forward weights to match backward synaptic feedback, which is necessary for teaching errors by feedback alignment. As a result, a network with pre-aligned weights learns notably faster than a network without random noise training, even reaching a convergence speed comparable to that of a backpropagation algorithm. Sequential training with both random noise and data brings weights closer to synaptic feedback than training solely with data, enabling more precise credit assignment and faster learning. We also found that each readout probability approaches the chance level and that the effective dimensionality of weights decreases in a network pretrained with random noise. This pre-regularization allows the network to learn simple, low-rank solutions, reducing the generalization loss during subsequent training. It also enables the network to generalize robustly to a novel, out-of-distribution dataset. Lastly, we confirmed that random noise pretraining reduces the meta-loss, enhancing the network's ability to adapt to various tasks. Overall, our results suggest that random noise training with feedback alignment offers a straightforward yet effective method of pretraining that facilitates quick and reliable learning without weight transport.
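
To make the feedback-alignment setting concrete, here is a minimal NumPy sketch of the idea: errors are propagated backward through a fixed random feedback matrix rather than the transpose of the forward weights, and a pretraining phase feeds the network pure random noise. The architecture, noise distribution, random targets, and learning rate are assumptions for illustration, not the authors' exact configuration.

```python
# Minimal NumPy sketch of feedback alignment (FA) with a random-noise
# pretraining phase, in the spirit of the paper. The two-layer network,
# Gaussian noise, and random targets are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 20, 32, 5

W1 = rng.normal(0, 0.1, (n_hid, n_in))   # forward weights, layer 1
W2 = rng.normal(0, 0.1, (n_out, n_hid))  # forward weights, layer 2
B2 = rng.normal(0, 0.1, (n_out, n_hid))  # fixed random feedback (no weight transport)

def fa_step(x, y, lr=0.01):
    """One feedback-alignment update: the error is sent back through the
    fixed random matrix B2 instead of the transpose of W2."""
    global W1, W2
    h = np.tanh(W1 @ x)
    out = W2 @ h
    e = out - y                         # output error
    dh = (B2.T @ e) * (1 - h ** 2)      # error projected through B2, not W2.T
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(dh, x)

# Phase 1: pretraining on pure random noise (random inputs and targets),
# which the paper reports aligns the forward weights with the feedback.
for _ in range(2000):
    fa_step(rng.normal(size=n_in), rng.normal(size=n_out))

# Cosine similarity between W2 and B2 indicates pre-alignment.
cos = np.sum(W2 * B2) / (np.linalg.norm(W2) * np.linalg.norm(B2))
print("alignment(W2, B2):", cos)
```

After this noise phase, training would continue on real data; the claim in the abstract is that the pre-aligned weights make that subsequent data-driven phase converge faster and generalize better.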


Algorithmic bias in AI

#artificialintelligence

Algorithmic bias in AI, also known as machine learning bias, describes systematic errors in the assumptions an algorithm makes during the machine learning process. Bias can emerge from many factors, not only from the design of the algorithm itself but also, and more often, from the way the collected data is used to train the model. Real-world examples of algorithmic bias appear in places like social media platforms and search engines. At times, biased algorithms cause serious problems, because their ranking and overall behavior can produce a range of wrong outcomes.


Get Ready For These Six 2020 Business Intelligence Trends

#artificialintelligence

More and more often, businesses are using data to drive their decisions -- which makes cutting-edge analytics and business intelligence strategies one of the best advantages a company can have. New technologies, especially those driven by artificial intelligence (or AI), are changing how businesses collect and extract usable insights from data. Here are the six trends you should be aware of that will reshape business intelligence in 2020 and throughout the new decade. New data-collection technologies, like internet of things (IoT) devices, are providing businesses with vast banks of minute-to-minute data unlike anything collected before. Matt Turck, an AI and data investor, calls it "the 'datafication' of everything" -- as more of the world comes online, it becomes possible to analyze, catalog and turn information into a format analysts, and AI, can break down.


Joining Human & Artificial Intelligence In Malaysia To Build The World's Most Efficient Workforce

Forbes - Tech

Each day, Mark Koh and his team at Kuala Lumpur-based data training & content moderation company Supahands help their clients test the limits of the question 'how much data is too much data?' Data drives innovation, but it is only valuable if it is assigned a purpose, and it is only usable if it is organized to serve that purpose. As companies continue to move toward digitization and automation, and more data is at their disposal than ever before, the amount of data requiring cleaning, tagging, and categorizing has exploded. Amazon and Google are two examples of companies that have set themselves apart through data-driven innovation, but many organizations don't have the scale, time or expertise to understand and make use of their data with the speed and accuracy needed to take full advantage of it. That's where Supahands comes in. Headquartered amidst the hustle-and-bustle and congestion of Malaysia's capital city, the Supahands team aims to create a less polluted data universe.